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We develop a formalism and present an algorithm for optimization of the trial wave-function used in fixed- 
node diffusion quantum Monte Carlo (DMC) methods. The formalism is based on the DMC mixed estimator 
of the ground-state probability density. We take advantage of a basic property of the walker configuration 
distribution generated in a DMC calculation, to (i) project-out a multi-determinant expansion of the fixed- 
node ground-state wave function and (ii) to define a cost function that relates the fixed-node ground-state and 
the non-interacting trial wave functions. We show that (a) locally smoothing out the kink of the fixed-node 
ground-state wave-function at the node generates a new trial wave-function with better nodal structure and (b) 
we argue that the noise in the fixed-node wave-function resulting from finite sampling plays a beneficial role, 
allowing the nodes to adjust towards the ones of the exact many-body ground state in a simulated annealing- 
like process. Based on these principles, we propose a method to improve both single determinant and multi- 
determinant expansions of the trial wave-function. The method can be generalized to other wave-function 
forms such as pfaffians. We test the method in a model system where benchmark configuration interaction 
calculations can be performed and most components of the Hamiltonian are evaluated analytically. Comparing 
the DMC calculations with the exact solutions, we find that the trial wave-function is systematically improved. 
The overlap of the optimized trial wave function and the exact ground state converges to 100% even starting from 
wave-functions orthogonal to the exact ground state. Similarly, the DMC total energy and density converges to 
the exact solutions for the model. In the optimization process we find an optimal non-interacting nodal potential 
of density-functional-like form whose existence was predicted in a previous publication [Phys. Rev. B 77, 
2451 10 (2008)]. Tests of the method are extended to a model system with a conventional Coulomb interaction 
where we show we can obtain the exact Kohn-Sham effective potential from the DMC data. 



PACS numbers: 

I. INTRODUCTION 

In diffusion quantum Monte Carlo (DMC) a trial wave- 
function is used to enforce both the antisymmetry of the elec- 
tronic many-body wave-functioni2ii and the nodal structure 
of the solution. In highly correlated materials, the accuracy of 
the trial wave-function becomes increasingly important and 
determines the success or failure of the method. Indeed, con- 
cerns about the fixed-node accuracy have tended to limit ap- 
plications of DMC to pre-transition metal elements. The dis- 
covery and development of new methods to improve the trial 
wave-functions, ideally without great computational expense, 
is consequently highly desirable for almost all DMC calcula- 
tions. 

In DMC calculations the trial wave-function v&t^R) is 
commonly a product of an antisymmetric function $r(R) and 
a Jastrow factor e J ( R \ Usually <j>r(R) is a Slater determi- 
nant constructed with single particle Kohn-Sham orbitals from 
density functional theory (DFT) or from other mean field ap- 
proaches such as Hartree-Fock. The Jastrow factor does not 
change the nodes, but accelerates convergence and improves 
the algorithm's numerical stability. The Jastrow factor is op- 
timized in a previous variational Monte Carlo (VMC) calcu- 
lation. The DMC algorithm finds the lowest energy of the set 
of all wave-functions that share the nodes of \&t(R)- The 
exact ground-state energy will be obtained only if the exact 
nodes are provided. Since any change to an antisymmetric 



wave-function must result in a higher energy than the anti- 
symmetric ground state, the energy obtained with arbitrary 
nodes is an upper bound to the exact ground-state energy^ 
Only in small systems is it currently possible to improve 
the node a 5 ' 6,7,8,9 or even avoid the trial wave-function ap- 
proach altogether. lft11 ' 12 For small or weakly correlated sys- 
tems, where other numerical approaches can compete, the util- 
ity of DMC as a method depends crucially on the accuracy 
of the trial wave-function. Multiple determinant, pfaffian^ 
and back-flow 8 - wave-functions and geminal products^ 3 - are in- 
creasingly popular due to the improved accuracy. 

To improve the DMC energy one must improve the nodal 
surface of the trial wave-function. However, to our knowl- 
edge, all algorithms for wave-function optimization are based 
on the VMC approach, with any improvement in the DMC en- 
ergy occurring only as a side-effect. The use of VMC might 
be a limitation since VMC samples more frequently the re- 
gions of the wave-function that have larger probability den- 
sity and are thus far from the nodes X Accordingly, VMC 
based optimization methods improve first the wave-function 
at regions which are far from the nodes, while the nodes are 
only improved indirectly. It has been found, however, that 
VMC based optimization methods, in general, also improve 
the DMC energyiiil Nevertheless, a direct optimization of the 
DMC energy is desirable, and might have improved conver- 
gence properties compared to current indirect approaches. 

While it has been shown by us and others that, within the 
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single Slater determinant approach, the computational cost of 
an electronic update step in the DMC algorithm can have an 
almost linear scaling with the number of electrons , is i 16 i 17 the 
use of these methods is limited if we do not find a better source 
of trial wave-functions than those obtained from mean-field 
approaches such as DFT. We recently showed 18 that Kohn- 
Sham DFT wave-functions cannot be expected to yield good 
nodes in general. As correlations increase, Kohn-Sham DFT 
wave-functions can be bad sources of nodal surfaces^ In- 
deed, we also found that as the size of the system increases, 
the nodal error of DFT wave-functions might be of the order of 
the triplet excitation energies, precluding the prediction of ac- 
curate optical properties^ even for simple carbon fullerenes. 
Accordingly, it is highly desirable to find a method to (i) ob- 
tain trial wave-functions with accurate nodal structures, (ii) re- 
tain the simplicity of a mean field approach, or (iii) use a min- 
imum number of Slater determinants i.e., the wave-functions 
are compact and easily evaluated, (iv) directly optimize the 
nodes in DMC, and (v) improve the nodal structure systemat- 
ically independently of the starting point. In this contribution 
we provide such a method. 

In order to use DMC to find the best trial wave-function we 
overcome two major obstacles: (i) obtain a representation of 
the fixed-node ground-state DMC wave-function suitable for 
optimization of the nodes, and (ii) find a method to keep the 
trial wave-function compact in large systems by minimizing 
the number of determinants. 

This work is the natural continuation of a recent article 
(Ref. fl~8h where we proved the existence of an optimal effec- 
tive nodal potential for generating the orbitals in the determi- 
nants in the trial wave-function used in DMC. While some 
details are rederived here, we recommend reading Ref. |T8| be- 
fore this article. We previously proved^ that specific prop- 
erties of the interacting ground state can be retained via 
minimization of cost functions in the set of pure-state non- 
interacting densities. Each cost function defines the gradient 
of an effective non-interacting potential which is optimized 
in a Newton-Raphson-like approach until the cost function 
reaches a minimum. In this paper we take the next step: we 
use known properties of the walker distribution function gen- 
erated in a DMC run to define a cost function relating the 
non-interacting wave-functions with the fixed-node ground- 
state wave-function. This allows us to obtain, for example, 
the Kohn-Sham potential or an effective nodal potential from 
the DMC calculation. The method appears to be limited by 
the quality of the fit, the statistics that one can collect in DMC 
and the representability of the nodal surface, which becomes 
increasingly more demanding as the number of electrons in 
the system increases. Although this might limit the applica- 
bility of the method to systems with small electron counts, we 
note that DMC is readily parallelized with excellent scaling 
on modern computers. We also expect that improved sam- 
pling and optimization schemes can be constructed using the 
initial ideas and methods presented here and in Ref. [HI 

The remainder of this paper is organized as follows. In Sec- 
tion [II] we demonstrate that the nodes can be improved by lo- 
cally removing the kinks in the fixed-node ground state. In 
Section [HI] we derive a formalism and a method to obtain a 
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FIG. 1: a) (Color online) Schematic representation of trial wave- 
function O^t, blue dots), fixed-node ground-state (*&fjv , purple con- 
tinuous), ground-state ($, black dash and dots), and new trial wave- 
function O^t, red dashed line) in the direction perpendicular to the 
nodal surface (x). We show that smoothing the kink in the fixed- 
node wave-function ^fn moves the nodes of towards the nodes 
of the ground state . b) Schematic representation of how the nodal 
surface evolves, shown with increasing purple line thickness, after 
each iteration in the algorithm. The noise introduced in the nodes by 
random fluctuations of the walkers is assumed to correct itself if the 
statistics is increased from one iteration to the next. 



multi-determinant expansion of the fixed-node ground-state 
wave-function directly from a DMC run. For many applica- 
tions, this expansion may already be sufficient. In Section 
II VI we present a cost function that allows the optimization of 
more compact trial wave-functions that match the fixed-node 
ground state. A formalism for wave-function optimization 
based on an effective DFT-like nodal potential is given. In 
Section [V] we apply and compare these methods to a model 
system that can be solved nearly analytically and demonstrate 
its convergence properties. In Section [VT] we propose a gen- 
eral algorithm based on the experience gathered solving the 
model. Finally in Section [VTIl we summarize and discuss the 
prospects of this method for application in large systems. 



II. SYSTEMATIC REDUCTION OF THE NODAL ERROR 
WITHIN DMC 

The importance sampling DMC algorithm, in the fixed- 
node approximation, finds the lowest energy^ E® MC among 
the set of all wave-functions that share the nodal surface 
St(R) where the trial wave-function 'I't(R) = and 
changes sign. The symbol R denotes a point in the many -body 
3N dimensional space of electron coordinates. We denote this 
wave-function f j?jv (R) as the fixed-node ground state. It can 
be shown that vp^-jvCR) corresponds to the ground state of 
the interacting Hamiltonian containing an additional infinite 
external potential located at the nodes of vffj^R). 

The gradient of the fixed-node ground-state wave-function 
^fa^R) can be discontinuous at the nodal surface ^(R)-— 
Indeed, if the nodes of the trial wave-function do not cor- 
respond exactly to the nodes S'(R) of an eigenstate of the 
Hamiltonian, the Laplacian of the fixed-node ground-state 
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wave-function must have a S(R) contribution at least on part 
of 5t(R)- Otherwise, since the time independent Schrodinger 
equation is satisfied elsewhere by "J^a^R) with an energy 
E® MC , without this delta in the Laplacian at the nodal sur- 
face, v]>i?Ar(R) would be an eigenstate of the Hamiltonian. 
This implies that the gradient of "Jfa^R) must be discon- 
tinuous at least at one point of 5t(R) if the nodal surface 
5 T (R) S(R). 

In Fig. QJ we show a schematic representation of the trial 
wave-function ^/ T (R), the ground-state wave-function ^(R) 
and the fixed-node ground-state "J^a^R). In this section we 
show that when this kink in ^fa^R) is locally smoothed 
away as 



* T (R) = / dR'* FAr (R')5(R'-R) (1) 



the nodes of the resulting functions improve for a broad class 
oftf(R-R'). 

Provided that $ fn is an antisymmetric function with finite 
projection on the ground state ^q, it nas been shown^2£ that 
"J, and its nodes converge to the exact ground state 



* = lim e-W-^V 

t — >oo 



FN 



(2) 



the Gaussian is required to improve the nodes. In turn the 
Gaussian factor can be replaced by any other approximation 
of the 6 function as long as it does the same to the nodes of 
^FAf(R) as some Gaussian for small r. 

In order to determine the class of smoothing functions that 
move the node in Eq. ([TJ as a Gaussian, we consider a patch 
dS(R) of the nodal surface S*t(R) centered at Rs with a di- 
ameter small enough (so that it can be considered a flat hyper- 
plane) but much larger than y/r. The integration of the 3N di- 
mensional Gaussian in the directions of the hyper-plane leads 
to a one dimensional Gaussian G(x/y/r) = e~ x / 2t /\J2-kt. 
Any approximation of S(R) after integration in 3N — 1 co- 
ordinates should result in a function d(x) that can be rescaled 
and translated to satisfy the following properties: 



xd(x)dx = and / x 2 d{x)dx = 1 



d{x)dx 

(5) 

In the immediate vicinity of Rs, the function ^fa^R) de- 
pends only on the coordinate in the direction normal to the 
surface ns defined as x = (R — Rs) • ns = AR ■ ns. For 
n Owe can approximate 

* FA r(Rs + AR)~* FJV (Rs) + cia; + A:i|a;| + 

+c 2 x 2 + k 2 (x~ \x\) 2 +0(x 3 ) (6) 

and 



where H is the Hamiltonian and Et is an estimate for the 
ground-state energy. Setting t = Mr in Eq. (f2| yields the 
equivalent equation 



* = lim | e 



-t(H-E t ) 



M 



FiV 



(3) 



In the limit of small r a real-space linear-order expansion of 



e -T(H-E T ) takes the form 



*(R'-R) = (2^T)^e-^ R )-^e 
* J2^ T(En - ET) \^n)(% 



(4) 



where K(R) is the potential energy term (including interac- 
tions) in the Hamiltonian and the E n are eigenvalues of the 
eigenvectors *5> n . Replacing the first line in Eq. (O in Eq. <£XJ> 
we obtain a function ^y(R) that has, by construction [see Eq. 
(O second line], an energy less than or equal to the energy of 
V FN (R!) (being equal for 5 T (R) = S(R)). This form of 
trial wave-function is similar to a shadow wave-function.^ 2 ^ 
If we could evaluate Eq. (Q~|) analytically 2 - and use the result 
i&T (R) in a new DMC run, we would obtain a new fixed-node 
ground-state wave-function with an even lower DMC energy. 
This implies that the nodes of (R) are better than the ones 
of * T (R). 

Note that Eq. (O tends to the Dirac S function as 

S(R) = (27rr)- 3JV / 2 e-( R - R ') 2 / 2T for t -> 0. The factor 
e -T(v(R)-s T ) m £q ^ Qes not a j ter tne no( j es: it is a pos- 
itive scalar function (only acts as a branching term in a one 
time step simulation). Accordingly, to linear order in r, only 



— *fjv(Rs + AR) ~ a + fcisign(a;)+ 
dx 

+2c 2 x + 4 k 2 (x - \x\) + Q{x 2 ). 



(7) 



In Eq. © the wave-function is expanded as a combination of 
a smooth function (with coefficients c\ and c 2 ) plus a kink 
(ki and k 2 ). Replacing Eq. (|6]l and Eq. (Q into Eq. (H) and 
replacing the Gaussian by a generic approximation of 8{x) = 
d(x/y/r)/y/f we get: 



* T (Rs) = kiA[d\y/r + 0(t) 



(8) 



and the first derivative 



Cl 



kiS[d\ + 4k 2 A[d}V^ + Q(t) (9) 



where A[d] = J \x\d(x)dx and S[d] = J sign(x) d(x)dx. 
Note that if d(x) has the Gaussian form A[G] = \/2/ir > 
and S[G] = 0. Using Eqs. ([8]) and (0 we can estimate the 
displacement of the node to be 



Ax 



hA[d] 



Q(t). 



(10) 



Therefore, for any symmetric approximation of the 5 function 
S[d] = 0, provided that A[d] > 0, one can obtain the same 
displacement in the node as a Gaussian with t' = nA[d] 2 T /2. 
For a non-symmetric d(x), the node will move in the same 
direction as long as the sign in the denominator of Eq. ( fTOb 
does not change. However, a uniform rescaling of r to match 
the Gaussian form will no longer be possible. That means 
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that the node will move faster towards the exact node in some 
regions of the surface than in others. 

Thus, as long as the approximation of the delta used for 
smoothing is a function of the distance only, with A[d] > 0, 
one can find some Gaussian that moves the node in the same 
way for every patch dS(R). This movement corresponds to a 
better node. The restrictions in d(x) can be alleviated by using 
a repeated convolution. Using the central limit theorem it can 
be shown that a recursive convolution of any approximation 
of d(x) tends to a Gaussian as long as the Taylor expansion 
of its Fourier transform exists. Thus if the shape of d(x) is 
not known the method would be more stable if it is applied 
sequentially. 

In section Hn] we will use a smoothing function of the form 

5 (R, R') = ^2 ^nO^^raCR')) (ID 

n 

where the $„(R) are continuous functions without kinks 
forming a complete basis and the "~" in means that only 
some elements are included in the sum (with a criterion de- 
scribed below). If the $„(R) in Eq. dTTb are obtained from 
a non-interacting problem and the criterion for truncation is 
an energy cutoff, it can be shown that the resulting function 
is only a function of the distance (R — R') 2 . Since in that 
limit only plane waves of large energy are added to Eq. ( fTTT > 
and all the lower plane waves are included in the lower en- 
ergy components, the basis can be transformed with a unitary 
transformation into a plane-wave basis with a spherical cut- 
off in reciprocal space. If there is the same number of plane 
waves in any direction the results of Eq. (TTTT) only depend on 
the distance which implies that S[d] = 0. 

Since we restrict the sum in Eq. ( fTTT i to fermionic antisym- 
metric <t>„(R), Eq. ( fTTT i expands an antisymmetrized deltaS. 
This form projects out any non-fermionic component intro- 
duced in the wave-function along the DMC algorithm as in 
the A-function approach used by Bianchi and collaborators, 26 

In Section [TV] we propose a simple interpolation scheme 
to smooth the node where the expansion used in Eq. ( fTTT i is 
not taken to the high energy cutoff limit. The fact that these 
smoothing methods work in practice suggests that the con- 
ditions to improve the nodes are extended beyond the exact 
equivalence to a Gaussian form. 

Note that a discontinuity of the gradient of the fixed-node 
wave-function $ fn (R) at the node implies 2 ^ that, if walkers 
are distributed according to \I> fn (R) with the sign (or phase) 
of f™(R.)i there will be more walkers in the vicinity of one 
side of the nodal surface than on the other. Accordingly, if 
these walkers are released in a pure diffusion algorithm^ for 
t — > they will cross, on average, more from one side of 
the nodal surface than from the other. The nodes defined by 
the population of these signed walkers 2 ^ would move in the 
same direction that would result from smoothing the kink in 

fn (R) provided the time step is short enough and kinetic 
energy term in the Green's function [Eq. (01] is dominant. 
Consequently, the nodes can be improved by moving them in 
the direction of least "walker pressure" within a pure diffusion 
approach. 



Any method to obtain fn (R) from the walker distribu- 
tion in a DMC run 2 ^ will carry the error of statistical fluc- 
tuations from using a finite sample of walkers. Even if 
the ^j7Ar(R) is forced to remain antisymmetric^ the nodes 
might move in the wrong direction because of these fluctu- 
ations. We assume the method is robust against these ran- 
dom fluctuations when applied recursively, and can form the 
basis of an optimization process to improve the trial wave- 
function. Note that if incorrect fluctuations increase the kink 
in $FAr(R) at the node, the probability to sample the cor- 
rect fixed-node wave-function will remain higher and also the 
probability to move the node in the correct direction in suc- 
cessive iterations. Conversely, fluctuations that correctly im- 
prove the nodes will be reinforced 2 ^ in successive iterations. 
Since these fluctuations are reduced when the statistical sam- 
pling is improved, the nodal surfaces will converge to the true 
nodes if the statistics is improved from one iteration to the 
next (Fig.QJ). Note that we do not claim that this process is 
necessarily the most efficient optimization approach: more so- 
phisticated iterative methods and optimization algorithms are 
clearly possible. 

Summarizing, we should be able to improve the nodes sys- 
tematically provided we can obtain the anti-symmetric func- 
tion ^>fn(T\) from the walker configurations (probability 
distribution) of a DMC calculation after convolution with a 
smoothing function^ 

III. DETERMINATION OF THE FIXED-NODE 
GROUND-STATE WAVE-FUNCTION FROM THE DMC 
PROBABILITY DISTRIBUTION 

A. Sampling the fixed-node ground-state wave-function 

The distribution function of the walkers in an importance 
sampling DMC algorithm is given by£ 

/(R) = *5>(R)* FJV (R) (12) 
1 Wc 

= lim — V^fR-R,) 

i 

where 'J't(R) typically has the Slater- Jastrow form 

<J/ T (R) = e J(R) $ T (R); (13) 

in which $t(R) consists of a single determinant for each 
electronic spin component composed of single-particle or- 
bitals. The results of this paper are also valid if $t(R) has 
a more general form such as consisting of multi-determinant 
expansions for each spin component and/or containing back- 
flow or two-particle pfaffians. The R; in Eq. ( fT2b correspond 
to the positions of an equilibrated ensemble of N c configura- 
tions in a DMC algorithm (we have set the weights equal to 
one for simplicity). 

We note that $j?jv (R) in Eq. (fT2l can be rewritten as an 
antisymmetric function times the Jastrow factor as 

* FJV (R) = e J(R) e- J(R) * FJV (R) 
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= e'C R )$> n (R|(nctn>)l*T) 

n 

= e J ( R >^A„4> n (R) (14) 

n 

where A„(FJ c ^ Yl c )\^t) is a complete configuration in- 
teraction (CI) expansion in the basis of electron-hole pairs . 
Accordingly, in Eq. (TBI the $„(R) are Slater determinants 
or pfaffiansS obtained from replacing in $t(R) some of the 
occupied (\> v single particle functions by unoccupied <p n func- 
tions, accordingly J dR$* (R)$ m (R) = <5„ >m . 

In practice, the CI expansion can be truncated retaining, for 
example, only the $ m (R) with a non-interacting energy be- 
low a given energy cutoff. The CI expansion in principle con- 
sists of all single, double, triple, quadruple and higher-order 
excitations. By analogy with conventional CI calculations, the 
higher-order excitations are expected to contribute less to the 
wave-function than low-order excitations. As the kinetic en- 
ergy of higher-order excitations increases as compared with 
the interaction, their contribution to the ground-state wave- 
function decreases. 

While a Jastrow factor e J ^ R - ) is not formally required in 
a complete expansion of the wave-function in Eq. (fT4l . it is 
believed that the introduction of a Jastrow factor limits the 
number of coefficients required in the multi-determinant ex- 
pansion, due in part to the more efficient description of the 
electron-electron cusp. For some applications it may be desir- 
able to not employ a Jastrow factor, since the extracted wave- 
function may be more easily used in later analysis. 

Replacing Eq. ( TBi i and Eq. ( fT3b in Eq. ( fT2b we obtain 

/(R) = e 2J ( R )$^(R) AA(R). (15) 

n 

Borrowing a method from Optimized Effective Potentials 
(OEP) we define the following projectors' 2 ^^ 



e»(R) 



,-2J(R 



,$n(R) 
$ T (R)' 



(16) 



Note that the projectors £ n (R) are symmetric (bosonic) 
functions.— Replacing /(R) by Eq. $15[ , using the definition 
of £ n (R) [Eq- ( TTSl l and the orthogonality condition it can be 
demonstrated that 



J dR/(R)C(R) = K 



(17) 



Thus, the coefficients of the multi-determinant expansion 
Eq. (TBI of the fixed-node DMC ground-state wave-function 
can be estimated directly as a sum over the total number of 
walkers N c along the DMC random walk, using the second 
line of Eq. (fT2] i. as 



(An 



1 



_ge;(Ri)7(Ri) 

c i— 1 



where 



7(Ri) 



-1 + t/1 + 2|v| 2 t 



with v 



V*r(Ri) 
*t(R 4 ) ' 



(18) 



(19) 



For convenience we divided by the number of walkers N c 
in Eqs. ([T2l and (fT8l since the normalization constant of 
^i?7v(R) and the corresponding coefficients A„ is arbitrary. 
The factor Y(Rj) in Eq. ( TT8l is a time step, t, correction de- 
rived following Ref. [30J that corrects the divergences of the 
projectors £ n (Ri) at the nodes. This correction is not always 
applied to estimators (e.g. the local energy) but we find that 
it reduces the error of the wave-function coefficients. For an 
uncorrected sample of walker configurations the error bar of 
the multi-determinant expansion can be determined from 



(A«) 
(o-n) 

An 



1 N ° 

— ]T|aR)| 2 7(R*) 2 (20) 



i=l 



(A») 2 -(A 2 ) 

Nr. 



(A n )± 



1 



As N c — > oo in Eq. ( |20b the error bar in the multi-determinant 
coefficients X n goes to zero. As usual, the error bars can be 
used to monitor convergence of the calculation. While the 
eventual goal is to obtain small error bars, we found in practice 
it is better to start with N c small and then to slowly increase 
it with each iteration as the trial wave-function improves (see 
below). 

By substituting Eqs. ( fT2b . dot , and ( [TBI ) into Eq. ( fT71 ) and 
defining the fixed-node function $fw in terms of the trial 
function Jastrow and the fixed-node wave-function ^> FN 



V FN (R) = e J(R >$™(R) 
one can obtain this expression 

A n = J dR$;(R)$™(R) 



(21) 



(22) 



for A„. We define \Pt(R) to be the truncated expansion (de- 
noted using ~) of Eq. (fT4l 

* T (R) =e J(R) ^A„$„(R) . (23) 

n 

Substituting Eq. (l22l into Eq. ( 1231 yields the equation 



§r(R) 



dR 



5> n (R)$*(R') 



^fjv(R') 



(24) 

In Section [TT] we showed that the appearance of a smoothing 
function of the form of Eq. ( fTTT i as in the term in brackets in 
Eq. d24l > will smooth the nodes of $^jv(R') yielding better 
nodes for ^j-(R). Since the $ n (R) are selected to be eigen- 
vectors of a non-interacting problem, highly localized features 
of &fn(R) would require components with high eigenval- 
ues. At the same time, resolving those details would require a 
large number of configurations to improve the statistics. Ac- 
cordingly, we truncate the expansion in Eq. (1231 to the coeffi- 
cients with relative errors smaller than 25%. Note that as the 
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statistics is improved, the error bars diminishes, the number 
of functions retained in Eq. (fTTb increases and so does the lo- 
calization of S (R, R'). Thus the conditions to improve the 
nodes systematically as described in Section|II]are reached as 
the statistics improves. 



B. Sampling the Jastrow factor 

Instead of expressing 'I'fa^R) as a product of the same 
Jastrow factor used in \£y(R) times a different multi- 
determinant expansion, one can choose to optimize the Jas- 
trow factor while using the same antisymmetric function 
$x(R)- It is eas Y to show that there is a symmetric bosonic 
factor that turns <£>t(R) into *j?jv(R) which is formally 
given by 



J(R) = ^FN(R) 
<E» T (R) 

Replacing Eq. ( TB) in Eq. d25l ) we find 

e J(R) = e JWJ2\ n 



(25) 



$n(R) 

<J?t(R) 
e 3J(R) E A «^( R ) 



(26) 



Note that the product e'^ (R) $ T (R) yields Eq. ([14). While this 
shows that the projectors £ n (R) could be used to improve the 
Jastrow factor, since they diverge for <£>t(R) — * 0, it is nec- 
essary to fit instead a continuous functional form using values 
away from the nodes where truncation and sampling errors 
play a dominant role (see SectionlTV). 

Updating the multi-determinant expansion of the antisym- 
metric part of the new trial wave-function, see Eq. ([23), alters 
the nodes because (i) the expansion is truncated and (ii) the 
coefficients of the multi-determinant expansion have a random 
error due to finite sampling in Eq. ( TT8") . On the other hand, up- 
dating the Jastrow factor, see Eq. (f26), keeps the nodes fixed 
but reduces the number of determinants required and the over- 
all computational cost. There is a compromise between ac- 
curacy and speeds A very good wave-function might have a 
very small variance in the local energy, but if it is expensive 
to evaluate one might obtain the same statistical error in less 
wall-clock time with a faster lower quality wave-function. In 
an ideal case, if the nodes are u-representable (see below and 
Ref. f]~8h only a single determinant is required to describe the 
fixed-node ground-state wave-function to sufficient accuracy. 
In practice, the form of the Jastrow factor e J ( R > is unknown, 
while an infinite multi-determinant expansion is infeasible. 
This implies that both the factors in Eq. (TBI are required in 
general; an efficient scheme will optimize both the Jastrow 
factor and determinantal part of the wave-function. Particu- 
larly for the case of a metallic system, the cost of a multi- 
determinant expansion might be prohibitive due to the large 
number of low-energy excitations. In this case it might be 
preferable to concentrate on an optimized Jastrow factor^ 



C. A simple self-healing DMC algorithm 

We have formulated, for small systems, a working iterative 
algorithm based on a multi-determinant or multi-pfaffian ex- 
pansion of the fixed-node ground-state wave-function. In this 
algorithm the calculated coefficients Eq. ([T8) of the expansion 
are used to form a new trial wave-function defined by Eq. ( [23) . 
Initially the statistical errors present in X n due to finite sam- 
pling appear to have a beneficial role, particularly when the 
initial trial wave-function has poor nodes. Note that in the 
limit of an infinite number of determinants in Eq. (123) with no 
statistical sampling errors in X n the trial wave-function would 
exactly reproduce the fixed-node wave-function, and an iter- 
ative improvement of the nodes would not be possible. Sta- 
tistical fluctuations in the coefficients A„ allow the nodes to 
move. In the next iteration regions near beneficial fluctua- 
tions are revisited by walkers while bad statistically insignifi- 
cant fluctuations tend not to propagate or grow. This stability 
against random noise appears to be valid in practice. Thus, 
the statistical error in the coefficients plays the role of a ran- 
dom thermal fluctuation in a simulated annealing algorithm. 3 ? 
It is ironic and remarkable that random errors can be used to 
eliminate systematic errors. 

While it is relatively economical to calculate a large number 
of multi-determinants every autocorrelation length, as more 
determinants are included in the trial wave-function each time 
step of the DMC calculation becomes more demanding. Ac- 
cordingly, for large or continuum systems a method to min- 
imize the number of determinants used to represent a given 
nodal surface is required. This is described in the next sec- 
tion. 



IV. DERIVATION OF THE BEST NODAL-EFFECTIVE 
POTENTIAL FROM DMC 

While a working multi-determinant algorithm can be con- 
structed on the basis of the multi-determinant expansion of 
the previous section, a significant step forward can be taken 
using the theory developed in Ref. 18 and taking advantage 
of Eq. ([12) to construct a new trial wave-function that can be 
evaluated more efficiently than the multi-determinant expan- 
sion Eq. ( 123) . This method will be most effective when the 
initial single particle orbitals involved in $x(R) are poor, e.g. 
if the system is strongly correlated. 

A. A cost function for the DMC algorithm 

Given a probability density p(R) and a binned statistical 
sample of N c configurations of the random variable R, we 
can define a new random variable 



JV c fiip(Ri) 



(27) 



which is distributed by the Chi-squared distribution 
function.— In Eq. (|27) £1; is the volume of the bin i, 
with n.j configuration counts, p(R^) is the average of p(R) in 
Hi and M is the number of bins. 
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Each term in Eq. (l27T i is the square deviation of n, divided 
by the expectation value of the mean. In the limit of large 
counts the square of the mean is expected to be equal to the 
square deviation for the Poisson distribution of counts in a bin. 
Accordingly, in \ 2 relative deviations from the mean have 
the same impact independently of the absolute value of the 
probability density. We will take advantage of this property 
to replace a wave-function difficult to evaluate Eq. ( fT2l ) by a 
simpler approximate one that retains key properties. Setting 
i%i = N c &li<j(Ri) in Eq. ( |27] i, dividing by N c taking the limit 
M — > oo, and using the mean value theorem, we find a cost 
function to compare two continuous distribution functions: 



K v 



dR 



[g(R)-p(R) 
p(R) 



(28) 



We showed in Ref. [Ts] that if we wish to preserve proper- 
ties, other than the density, cost functions can be defined relat- 
ing the many -body ground-state *(R) with a non-interacting 
wave-function $r(R). The walker distribution function 2 
given by Eq. (fT2l allows us to construct several cost functions 
relating the wave-function to optimize with the exact fixed- 
node ground-state ^fn(R). Using Eq. (l28l l as a guide, we 
propose the following expression: 



K 



DMC 



dR 



p tf r (R)tt T (R) - /(R) 



H * T (R)* T (R) 

e(f(K)- v ), 



(29) 



where \£y(R) is a trial wave-function to be optimized, p = 

1 _1 

J* r (R)$ T (R)dR , / (R) is given by Eq. O with co- 
efficients obtained from a previous DMC run using Eq. ( TT~8b . 
9(x) is the Heaviside function, and 77 is a small positive n 
umber. Note in Eq. (1291 that the first factor vanishes when 
*t(R) — * *fjv(R)- Indeed, if § T (R) is constrained to 
have the nodal surface S't(R) and the sign (or phase) of 
^t(R), the integral of the first factor in Eq. (l29l measures the 
probability that the distribution of a given ensemble of walk- 
ers /(R) corresponds to the distribution 



,20 



a(R) = ^*t(R)*t(R) 



(30) 



In Eq. ( |29l , we add an absolute value function in the de- 
nominator of the first factor and a Heaviside function in or- 
der to extend the set of ^y(R) where the cost function can 
be evaluated beyond the fixed-node space. Note that, since 
/(R) > 0, while negative values for a(R) are allowed, they 
are penalized in the numerator more than positive values. In 
Eq. ([29), we add p to enforce / a(R)dR = / /(R)dR for 
any ^t(R)- In Eq. d29l the nodes of 4't(R) can move within 
a distance [which depends on 77 and /(R)] around Sr(R). 
Otherwise, if the zeros of the numerator and denominator of 
Eq. d29l do not match, the value of the cost function would 
rise to infinity. An additional effect of 9 is that any kink of 
^>fn(R-) at the node is not enforced by the cost function in 
^r(R). Since ^t(R) will be obtained from the minimum 



energy solution of a non-interacting problem^ and departures 
at the node are not penalized, it will interpolate smoothly 
avoiding a kink. Note that we can choose alternative cost 
function forms. For example, we can replace the denominator 
in Eq. d29l by /(R). This choice would simplify the deriva- 
tives of the cost function but it has a couple of disadvantages: 
First /(R) is expected to be a very noisy function when its 
magnitude is small, while the product of non-interacting v- 
representable wave-functions a(R) = p ^t(R) v I't(R) is 
expected to be smooth (see llV Bl > . We choose not to amplify 
the noise of /(R) in the denominator. Second, in Eq. ( |29l a 
small number for a(R) outside the window defined by the 
Heaviside function is highly penalized which confines the 
node of 4'r(R) to remain inside the window where the Heav- 
iside function is zero. 



B. Representability of the nodal surface 

Given an interaction in a many-body system, the 
Hohenberg-Kohn theorem 33 establishes a functional corre- 
spondence between electronic densities p(r), external poten- 
tials V(r), and ground-state wave-functions 4 r (R). The sub- 
set of densities p(r) corresponding to a ground state of an in- 
teracting system under an external potential V(r) are denoted 
as pure state w-representable.— A non-interacting pure state v- 
representable density is given instead by p(r) = J2 V ( r ) 1 2 
where <j) v (r) are Kohn-Sham-like- single particle orbitals, or 
eigenvectors, of the single-particle Hamiltonian: 



1 , 
-2 V ' 



V(t) 



4> y (r) = e v <\> v (r) . 



(31) 



where V (r) is an effective single particle potential. The low- 
est energy Slater determinant constructed with the solution of 
Eq. (|3TI ) is a many-body non-interacting ground state. For 
simplicity we denote those quantities that are simultaneously 
interacting and non-interacting w-representable as simply v- 
representable. In addition, certain quantities can be multi- 
determinant w-representable, meaning that they can be rep- 
resented by a finite multi-determinant expansion constructed 
with the solutions of Eq. < [3TT ). Since, the ground-state den- 
sity p(r) determines the ground-state wave-function ^(R), 33 
p(r) defines also the points R of the nodal surface S(R) 
where ^(R) = 0. The nodes of the trial wave-function, in- 
stead, are by construction those of <£>t (R) (non-interacting v- 
representable in the single determinant case). The exact nodes 
S(R) may or may not be representable in this manner.— 



C. Optimization of the effective nodal potential 

The trial wave-function is often constructed with non- 
interacting orbitals derived from an effective potential [see Eq. 
Oj}], e.g. from Kohn-Sham DFT For the moment we will 
assume that "Pt (R) is given in the single determinant Slater- 

Jastrow form: ^^(R) = e J ^ R ^$T(R) (this derivation is ex- 
tended to multiple determinants or pfaffians in II V F| ). How- 
ever, for now, we assume that the node can move within all 
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the non-interacting u-representable set, which is a less restric- 
tive condition than the fixed-node approximation but implies 
accepting an error if 5(R) is not u-representable. 

In Ref. [TH we showed that, if the trial wave function de- 
pends on non-interacting orbitals in an effective potential [as 
in Eq. d3Tbl, the effective potential V (r) required to retain a 
given property is a function of the cost function K. To sim- 
plify formulae, discussion and notation we assume here that 
all wave-functions are real. The potential can be obtained by 
adding recursively the following correction: 



, 5K 54> v fV) 
) SV K (r) ' 



(32) 



where e is adjusted during the optimization. Replacing K by 

K D mc we get 



SKdmc / dW(R)e J(R) 



K (r') 



K (r') 



where 



W(R) = 



SK 



DMC 



<y* T (R) ' 



(33) 



(34) 



for which we obtain 



2A(RWR) - A(R) 2 . , 

W R = I >PM2 sign a R 

|a(R)| 2 

x [l-a(R)]/i* r (R)»(/(R)-»?), (35) 

with A(R) = /(R) - ar(R). Within first order perturbation 
theory 

5<t> v (r') _ £ ^(r)^(r)^ ^ ^ 



5Vif (r 

Replacing Eq. d33l and Eq. d36l l in Eq. J32t , we find 



dV KDMC (r) = £^2Y1 



Ur)M^ f (3?) 



= / dRVK(R)e J < R )$^ l/ (R) . (38) 



In Eqs. (El, d36j, and (EJ we used El (En) to define 
sums over occupied (unoccupied) states. In turn in Eq. (l38l 
$^ i/ (R) means replacing the occupied state <j> v by <j> n which 

results from combining the cofactors of <j) v (r') [ /^y) ] in 
Eq. d33l with <j> n (r') in Eq. (f36b . The first factor in func- 
tion W(R) [Eq. (f35T >1 is obtained from the derivative of the 
cost function Eq. ( f29b with respect to a(R) [ignoring contri- 
butions coming from the discontinuities of \x\ since the Heav- 
iside function in Eq. ( f29T > is zero near the nodes]. The second 
factor in W(R) results from the derivative of a(R) with re- 
spect to 'I't(R)- [note that [i is also dependent on 4'r(R) 
1 



D. Optimization of the Jastrow factor within DMC 

We argued in the previous section that an optimal Jastrow 
factor can be used to reduce the number of determinants in the 
multi-determinant expansion. Optimizing the Jastrow factor 
is important to limit the exponential cost of the CI expansion 
because, while the Jastrow factor cannot influence the nodes, 
it can reduce the burden of correcting the probability density 
from any value given by a Slater determinant (see Eq. d25ll). 
Accordingly, if the Jastrow factor is optimized, the antisym- 
metric part of the wave-function is free to search for the nodes. 
Often the J(R) is dependent on a set of parameters j n . The 
value of the cost function (Eq. [29T > is also affected by the Jas- 
trow factor e J ( R \ Thus the gradient of the cost function with 
respect to an arbitrary change in e J ' R - ) can be obtained within 
DMC via 



dK DMC = i dw(R)e J(R)|, T(R) 



dJ(R) 



(39) 



E. Discussion 



Note at this point that (1) both the coefficients (3" and 7„ 
are integrals of the function VK(R) which is only dependent 
on the particular form of the cost function selected in Eq. (|29l 
and a representation of the walkers distribution /(R). 

(2) The function /(R) is an essential component of W(R) 
that can be obtained from the DMC run using Eqs. ( fT5l ) and 
( TT~8b or sampled directly by binning.— 

(3) Provided that /(R) is known, a distribution of 
configurations R^ with probability |W(R)| can be gen- 
erated with the Metropolis algorithm. All integrals of 
the form f dRg(R)W(R) involved in Eqs. ([37) and (09} 
can be evaluated in a single correlated sampling step as 
Ei sign[W(Ri)]<7(Rj) using points R; drawn from the prob- 
ability distribution defined by the absolute value of VF(R). 

(4) In most methods, the Jastrow parameters j n are opti- 
mized within a variational Monte Carlo approach (either min- 
imizing the total energy or the energy variance). Here we op- 
timize them within a DMC run. The role of the Jastrow factor 
within this approach, is different. Its role instead is to correct 
the trial wave-function $t(R) to match $>fn(TL). The opti- 
mization of the Jastrow parameters with Eq. d39l only ensures 
that the cost-function Eq. ( |29l is minimum. Optimization of 
the Jastrow factor is required to allow the antisymmetric part 
of the wave-function to move the nodes while the Jastrow fac- 
tor takes care of the symmetric contribution. However, if the 
variational freedom of the Jastrow factor or the statistics are 
limited, the minimization of Eq. d29b does not necessarily im- 
ply a minimum in the VMC energy or its variance: the vari- 
ance of the local energy might rise. In those cases the Jas- 
trow factor must be optimized twice: first when the potential 
is optimized and second during a VMC variance minimization 
before a collection DMC run. 

Finally, (5) note that § T (R) and <J/ T (R) have different Jas- 
trow factors (^t(R) is kept fixed during the cost function op- 
timization steps). 
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F. Optimization of multi-determinant wave-functions 

The multi-determinant expansion obtained in this subsec- 
tion is different from the one obtained in Section [III] In Sec- 
tion|III]we found a multi-determinant expression of ^jtjv-(R) 
in a given non-interacting orbital basis set for a given fixed 
Jastrow factor. Here we optimize the Jastrow factor and the 
non-interacting basis to match $fjv(R) within a prescribed 
small number of determinants. 

If we restrict the search to pure-state non-interacting v- 
representable nodes, the minimum energy Edmc will be 
larger than the true ground-state energy E[p(r)], because of 
the upper-bound theorem, unless 5(R) is v-representable. 
In DMC the w-representability constraint is not required and 
can be partially removed by including multi-determinants in 
<&t(R) giving more variational freedom to the nodes. 

Note that if we express "f^R) as a multi-determinant ex- 
pansion of the form 



* T (R) = e J ( R) 5> fc l> fe (R), 



(40) 



an equivalent expression for wave-function optimization can 
be found. The sum over occupied (unoccupied) levels in 
Eq. (l32l must be extended to every orbital that is occupied 
(unoccupied) in $fc(R). Also, it is easy to prove that the only 
change in Eq. ( |37| | required is in the values of the /3" which 
must be replaced by 



0u 



J cHW(R) 



**(R), (41) 



where the operators c' n and c v change, when possible, the sin- 
gle particle state (\> v with (f> n in the Slater determinant $j.(R); 
and give zero if <f) l , is not included or 4> n is already occupied. 
The function VK(R) is still given by Eq. 135) . The coefficients 
ctk can be optimized using the following expression 



<1Kdmc 
da k 



= / dRW(R)e J ( R >$ fe (R) 



(42) 




V. MODEL SYSTEM TESTS 



20000 40000 60000 80000 100000 120000 
Number of DMC steps 

FIG. 2: (Color online) Self-healed DMC run obtained using the 
method described in Section [Til] Black points denote the average 
value of the local energy for each DMC step. Green points mark the 
reference energy used for population control. Orange lines mark the 
average energy of the trial wave-function. The horizontal blue line 
marks the energy of the ground state in the full CI calculation. Ver- 
tical lines mark the steps when the coefficients of wave-function are 
updated. Inset: Detail of the DMC run for the first 10000 steps (same 
conventions as in the main figure). 



effective potential for the wave-function nodes. Briefly, 
we solve the ground state of two spin-less electrons mov- 
ing in a two dimensional square of side length 1 with 
a repulsive interaction potential of the formal V(r, r') = 
87r 2 7 cos [air(x — x')] cos [air(y — y')]. In this paper we 
show results for a = 1/tt and 7 = 4. With this choice of 
parameters the system is in the highly correlated regime, be- 
cause the matrix element of the interaction potential between 
the non-interacting ground state and first excited state is larger 
than the non-interacting energy difference. We expand the 
many-body wave-function in a full CI expansion of Slater de- 
terminants with the same symmetry as the ground state. The 
ground state is degenerate because there are only two elec- 
trons. We choose one of the ground-state wave-functions ac- 
cording to the D-2 subgroup of the D4 symmetry of the Hamil- 
tonian. For more details see Ref. [TH From the full CI calcu- 
lation we obtain a nearly exact expression of the ground state 
$(R) = E„fl„$„(R). 



In this section, to demonstrate the methods described 
above, we solve a simple yet non-trivial interacting model as 
a function of the interacting potential strength and shape. We 
then test a simple version of the algorithm described in Sec- 
tion [TIT] Subsequently, we replace the model interaction by a 
realistic Coulomb interaction. Finally, in subsection IV Dl we 
optimize the wave-functions by obtaining the effective nodal 
potential, as described in Section ITvl 

A. A model interacting ground state 

For illustrative purposes we choose the same problem 
studied in Ref. Ilia where we derived the existence of an 



B. Projection of the DMC fixed-node wave-function on a 
multi-determinant expansion 

In order to facilitate the comparison with the full CI results, 
we sample the mixed-estimator density with the projectors 
£„(R) constructed using the same basis functions of the CI 
expansion. For the same reason, we utilized no Jastrow func- 
tion (J = Oin Eq.[T6b. 

An initial trial wave-function must be selected. While the 
non-interacting solution has very good nodes, 11 ^ we intention- 
ally chose a poor initial trial wave-function in order to test the 
strength of the multi-determinant method described in Sec- 
tion [TIT] The worst case scenario is when the trial wave- 
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function is orthogonal to the exact ground state. If the ex- 
act ground state is not included in the trial wave-function, a 
projector method such as the standard DMC algorithm can- 
not yield the exact ground-state energy. Accordingly, to test 
the method, we chose for this example Ai = 03, A3 = — a\ 
and A„ = for all remaining n.— Expanding, ^fnCR) with 
these A„ and replacing it in Eq. ([Tol l we obtain the projectors 
£ n (R). Next we obtained new values A„ sampling Eq. (fT8T i 
every autocorrelation time. After many configurations are 
sampled, we construct a new trial wave-function with the new 
A„. We only include in the wave-function the coefficients that 
satisfied the condition |A„| > 4-^==, i.e. that the coeffi- 
cients are well determined according to this empirical thresh- 
old. Note that because the multi-determinants are solutions of 
a non-interacting problem, they will tend to have more nodes 
as their energy increases. Accordingly, high energy compo- 
nents of the wave-function will have smaller coefficients (A„) 
in absolute value as compared with the error (<r„). As a con- 
sequence, this acceptance threshold removes the contribution 
of the high energy components which implies that the result- 
ing wave-function will be smoother than $j?jv (R-) without the 
kinks at the nodes. This process is the core of a more complex 
algorithm we propose for larger systems that is explained in 
Section[VT](see steps 3 and 4). 

The result of this iterative approach is summarized in 
Figs. 12 [3] IH and [5] In Fig. [2] we show the average of the 
local energy El (black dots) and the best estimator for the 
energy Ebest (green dots) 30 as a function of the number of 
DMC steps. The average energy of the trial wave-function 
E = (^t\H Y&t)/ (^tI^t) (orange) is also given for com- 
parison. The run was carried out for a targeted population 
of 200 walkers. The exact full CI result is given by the blue 
line. There is a dramatic decrease of El, Ebest and E as the 
trial wave-function is updated, and all these values converge 
to the full CI result. Similar results are obtained with differ- 
ent starting points and interaction strengths. The only limiting 
factor to reaching the exact CI results appears to be the itera- 
tion time. The reduction in the energy variance can be seen in 
Fig. [2] where the fluctuations in the local energy decrease as 
the run continues. 

In Fig. [3] we show a plot of the values of the full CI coef- 
ficients as a function of the coefficient index compared with 
the average values obtained from the optimized trial wave- 
function and a final DMC run using Eq. < fT~8b . The coefficients 
are ordered with increasing non-interacting energy. The error 
bars of the coefficient are also given. The figure shows that 
a wave-function expansion with the quality of a CI expansion 
can be obtained with DMC. Note that (i) knowledge of the 
ground-state wave-function allows for the calculation of any 
other observable with an error bar that can be obtained from 
the error bars of the expansion coefficients, (ii) The same 
wave-function could be expressed with a smaller number of 
determinants if a Jastrow factor had been used. 

In Fig. [4] we show the evolution of the values of the full CI 
coefficients as a the algorithm progresses starting from a trial 
wave-function orthogonal to the ground state. 

The improved quality of the DMC optimized trial wave- 
function is also evident in Fig. [5] We plot the logarithm of the 
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FIG. 3: (Color online) Values of the coefficients of the multi- 
determinant expansion (small green circles) as compared with a full 
CI calculation (large black circles). The DMC statistical errors of the 
coefficients is equal to the radius of the green circles. 
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FIG. 4: (Color online) Change in the values of the multi-determinant 
expansion as the DMC self-healing algorithm progresses. Light gray 
colors denote older coefficients while darker ones denote more con- 
verged results. The initial non-zero coefficients are highlighted in 
red squares. 



residual projection 

R P = log [l-(tf|tf T )/(|tfp T |)] (43) 

on the "exact" CI ground state as a function of the logarithm 
of the total weighted number of configurations along the com- 
plete run N w . Remarkably, the error of the wave-function pro- 
jection has decreased to e~ 8 starting from 1. By noting that 
|* T > = I*) + |<^J_), where |<5*j_) is the difference between 
the ground-state |\I>) and the trial wave function \^t) we g et 



R P ~ 2 log \\8*±\/y/2 



(44) 



We can see that for a significant section of the run Rp ~ 
1/iV^, where N w is the total number of weighted configura- 
tions of the run. This means that the magnitude of the error in 
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FIG. 5: (Color online) Logarithm of the residual projection Rp [see 
Eq. J43H as a function of the total weighted number of configurations 
along the complete run N w . The lines are a guide to the eye. Inset: 
projection of the DMC self-healed wave-function onto the full CI 
ground state as function of the logarithm of N w . 



the trial function decays with a faster exponent than 1 / \/N w 
(3/2). This is surprising because if we had provided the ex- 
act ground state as trial wave-function, the error after finite 
sampling would have scaled as \8^!±\ ~ l/\/N w , which re- 
placed in Eq. (l44l i gives R P ~ 1/ y/N w . This faster exponent, 
in a section of the plot, is a direct consequence of the fact 
that both the quality of the trial wave-function and the statis- 
tics have improved. This is another indication that the nodes 
continue to improve along the run. For the final part of the 
graph (the last three points), however, Rp scales as l/^/N w . 
This possibly signals that after the nodal structure is improved 
to a critical distance from the exact ground state, the statis- 
tical error in the determination of the coefficients and not a 
small fluctuation in the nodal structure, is the limiting factor 
for this algorithm. We believe that a final 1 / \ / N w scaling of 
Rp signals also that the overall nodal structure of the solution 
is correct and only small fluctuations of the coefficients are 
responsible for the small fluctuations from the exact node. 

Since a direct sampling of the fixed-node wave-function 
(Eq. ( fT"8l )) aims to reproduce the fixed-node solution, a single 
DMC run cannot improve the nodes. Only by iterating with 
different trial wave-functions can the nodes be improved. In 
particular, if an infinite number of configurations were used, 
the nodes would not change. In practice however, we find 
that for a finite sample, the error in the wave-function coef- 
ficients plays a positive role. Errors act as random fluctua- 
tions in a simulated annealing algorithm. These fluctuations 
are reinforced^ or discarded in subsequent iterations. This al- 
lows the nodal error to be systematically reduced to the point 
that trial wave-functions with 0.9995 projections on the full CI 
ground state can be found starting from a trial wave-function 
initially orthogonal to the ground state. Since poor nodes are 
associated with discontinuities in the derivative of '5 pn (R) at 
the nodal surface, and consequently an increase in the kinetic 
energy, it is also convenient at first to initially limit the number 
of configurations sampled (including first the ones that cost 
less non-interacting energy). 

We recognize that the current work does not address the 
suitability and convergence of this method of relying on ran- 



FIG. 6: (Color online) Energy of the DMC run as a function of the 
number of DMC steps used to gather statistical data of the wave- 
function in the previous block. The statistical error bars for the 
first three points on the left were not calculated. The statistical er- 
ror bars of the points on the right were smaller than the size of the 
symbols. Blue squares denote calculations starting from a bad trial 
wave-function, while the red circles mark the results obtained from 
an initial trial wave-function corresponding to the best blue square 
on the right (see text). Green diamonds were generated starting from 
the best red circle. 



dom fluctuations for systems with large numbers of electrons; 
this will be the subject of later studies. 



C. Coulomb potential results and discussion 

The use of a simplified electron-electron interaction facili- 
tates the CI calculations and the validation of the optimization 
method described in Section[lIl| However, it is also important 
to test the convergence and stability of the method with a real- 
istic Coulomb interaction. Note that in two dimensions (2D) 
the correlations are enhanced as compared with three dimen- 
sions (3D) while the nodal surface remains non-trivial. 

We tested the stability of the algorithm by replacing the in- 
teraction potential withal V(r, r') = 20tt 2 / |r — r/| . Since 
the length of the square box side is 1, the difference in ki- 
netic energy between the non-interacting ground state and the 
first-excited state is 3tt 2 . This choice of parameters for the 
Coulomb potential placed the system in a strongly interacting 
regime. To further increase the role of correlations and the dif- 
ficulties that the algorithm must overcome we did not included 
a Jastrow term, i.e. J = 0. We also increased the chances of 
failure by setting the initial trial wave-function equal to the 
first excited state of the non-interacting system. 

In Fig. [6] we show the evolution of the average of the lo- 
cal energy for each DMC optimization block as a function of 
the number of DMC steps in each optimization block NpjMC- 
Data for Eq. ( fTSl ) is accumulated every 100 DMC steps. As 
in the case of the model Hamiltonian, we increase Ndmc m 
each optimization as Noaic = 200 x 2™ b / 2 where rib is the 
total number of blocks. With this choice we can expect the 
error bar in the energy and in the coefficient A„ of the multi- 
determinant expansion Eq. ( fT4b to be reduced a factor 1 /2 af- 
ter four successive blocks. Note that during each DMC run 
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not only the local energy is sampled but also the values of the 
projectors £(R) used to construct the expansion of the trial 
wave-function of the next point on the right with Eq. (Q~8j. 

The blue squares in Fig.|6]show the progression in average 
DMC energy starting from the first excited state. The initial 
energy is above 420 compared with the fully converged energy 
of 402.718 ± 0.008. Even starting from such a bad initial trial 
wave function, our method is able to improve in the second 
block after only accumulating « 400 configurations. In con- 
trast, the red circles in Fig. [6] denote the results obtained with 
an initial trial wave-function constructed with data collected 
with the right most blue square, a very good initial trial wave- 
function. 

As the optimization process is repeated, the average DMC 
energy fluctuates. Since the coefficients carry a statistical er- 
ror, the wave-function is not the same from one block to the 
other and neither is the nodal error. There is a shift from one 
iteration to the next which is sometimes larger than the error 
bar in the energy. The energy and the variance can fluctu- 
ate and locally increase. However, as the statistics improve, 
fluctuations in the coefficients decrease. The statistical er- 
rors play the role of a thermal noise in the coefficient ex- 
pansion. Improved statistics correspond to reduced temper- 
atures in simulated annealing. Note that, initially, the aver- 
age DMC energy from the very poor trial wave-function de- 
creases (blue squares) as the algorithm progresses, while the 
energy of the average DMC energy from the good trial wave- 
function (red circles) actually increases. This is because when 
the statistics are poor the errors in the coefficient expansion 
allows improvement of a bad trial wave function but spoil a 
good quality one. Figure [6] shows that, as the algorithm pro- 
gresses and improved statistics are obtained, the quality of the 
solution becomes independent the initial trial wave-function. 
Note that for intermediate blocks the DMC energy becomes 
flat, signaling that the the statistics are not enough to reduce 
the nodal error, but are sufficient to stop deterioration of the 
wave-function. 

Repeating the algorithm iteratively leads to an incremental 
improvement in the statistics which results in a clear reduc- 
tion of the DMC energy beyond the error bar of the preceding 
calculations. The DMC energy and the energy variance are 
reduced systematically which is a clear indication of the re- 
duction of nodal errors and improvement in the overall quality 
of the wave-function. The ground-state energy obtained after 
240000 accumulated DMC iterations is 402.718 ± 0.008. 

In Fig. [7] we show the values of the coefficients of the multi- 
determinant expansion as obtained with Eq. ( fT~8b correspond- 
ing to the right-most blue point in Fig. [6] Note that since no 
Jastrow factor is used and the interaction potential includes 
a singularity at r = r', the number of coefficients with sig- 
nificant value is much larger the model interaction described 
earlier. The final reduction of nodal errors shown in the fi- 
nal steps of Fig.|6]is associated with subtle variations of the 
coefficients. 

If the Jastrow factor is set to one, the density takes a sim- 
ple form (Eq. d45l )) in terms of the single-particle orbitals 
4> n (r). Knowledge of this density allows the calculation of 
the Kohn-Sham potential as explained in Ref. [l8| (see be- 
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FIG. 7: (Color online) Values of the coefficients of the multi- 
determinant expansion (small green circles) obtained from the DMC 
run for two electrons in a square box with a Coulomb interaction in 
the highly correlated limit. The statistical errors in the values of the 
coefficients are equal to the size of the red bar. 




FIG. 8: Density of the ground state of two spin-less electrons with 
Coulomb interaction in a square box. We choose one of the two de- 
generate ground states, reducing the symmetry of the density to Di. 
a) Left side of the density of the many-body ground state constructed 
with the converged coefficients shown in Fig. [7] b) Kohn-Sham non- 
interacting density constructed as explained in Ref.flU 



low) and suggests an alternative route for calculation of forces 
by applying^ the Hellmann-Feynman theorem directly to 
the Kohn-Sham total energy instead of the usual statistical 
sampling4ML The DMC density can be obtained in terms of 
the single-particle orbitals with the following equation^ 

p(r)=J2K(r)Mr)Y, X l X ^\ c n c ^) ■ ( 45 ) 

n,v k,l 

Note in Eq. d45l ) that all the matrix elements (<E> fc |c^c^ | ) 
corresponding to states that differ in more than one electron 
hole pair, do not contribute to the ground-state density. In 
Fig. [8^ we show the density corresponding to the coefficients 
of Fig. [7J and in Fig. [SJ) the non-interacting Kohn-Sham den- 
sity constructed using the methods explained in Ref.Qjl 

In Fig. [9] we show the Kohn-Sham potential obtained us- 
ing the methods described in Ref. We minimized the cost 
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FIG. 9: Kohn-Sham potential for two spin-less electrons in a square 
box corresponding to the ground state of Figs. UJ and [8] The potential 
was constructed using the methods explained in Ref.Ua. 

function in Eq. (2) of Ref. [TH using 14 Fourier components in 
the potential expansion. We believe that the sampled oscilla- 
tions in the Kohn-Sham potential carry some physical mean- 
ing. Indeed, these oscillations are required in order to match 
the non-interacting density in Fig. [SJ) to the interacting self- 
healed DMC density in Fig. [8^. However, since the density 
p(r) has an error u p {r), there is also an error in the Kohn- 
Sham potential. In linear response,— the error bar in the po- 
tential cr_R-s(r) (not shown) can be obtained in terms of o>(r') 
and the inverse susceptibility as 

f SV (r') 

a KS (r)= J dr>a p (r>)^-f . (46) 

Since, we have removed degeneracies in the ground state by 
restricting the symmetry of the wave-function, two potentials 
that give the same density can only differ by a constant. We 
have obtained from DMC not only the approximated DMC en- 
ergy but also the derivative of the total energy with respect to 
local fluctuations of the density. Figures[8]and[9]show that this 
method can provide accurate benchmarks for the validation of 
DFT approximations in the highly correlated regime. 



D. Model system effective nodal potential and Jastrow factor 

To demonstrate that the effective nodal potential and Jas- 
trow factor can be obtained through sampling in DMC, in this 
section we determine these quantities for a model correspond- 
ing to two electrons in a square box with Coulomb interac- 
tions. An additional goal is to show that a complex (multi- 
determinant) wave-function can potentially be replaced by a 
simpler one while retaining the same nodal structure. 

The results below correspond to a trial wave-function repre- 
sented using the multi-determinant expansion shown in Fig. [7] 
While for larger dimensional systems the integrals can be per- 
formed more efficiently using a stochastic approach, in this 
case the probability densities were binned numerically over 
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FIG. 10: a) Effective nodal potential, b) one-body Jastrow, and c) 
two-body Jastrow factors obtained by minimizing Eq. j29\ , in which 
the multi-determinant expansion of Fig. UJ has been replaced by a 
single determinant function. 



a grid of fifteen bins in all four dimensions. Approximately, 
7.2 x 10 5 weighted 4 ^ configurations were collected. 

The one-body and two-body Jastrow factors were simply 
written as a Fourier expansion and their coefficients were min- 
imized with an accelerated steepest decent algorithm using 
Eq. (f39b . The antisymmetric part of the wave-function was 
given by a single determinant corresponding to the ground- 
state solution of a non-interacting effective potential. The ef- 
fective interactive potential was expressed as a sum of cosine 
functions and optimized as explained in Ref. [lj| The Jastrow 
factors and the potentials can be optimized at the same time. 
However, since we wanted the Jastrow factor to carry most 
of the load in the optimization of the symmetric corrections to 
the probability density, the potential was optimized only every 
third iteration that the Jastrow factor was optimized. 

The resulting potential, and Jastrow factors are shown in 
Fig. [10] The value of the cost-function was reduced an order 
of magnitude from starting with the non-interacting ground 
state with zero effective potential. The effective potential re- 
sulting from this minimization procedure is an example of the 
nodal potential predicted in Ref. [ll| 

We also performed tests of this optimization algorithm us- 
ing the model interaction discussed in Subsection lVBI In this 
case the nodal structure of the wave-function was also im- 
proved (as signaled by a reduction of the average DMC energy 
below the error bar of the preceding calculation). 



VI. SUMMARY OF IMPROVED SELF-HEALING DMC 
ALGORITHM 

It is clear from previous sections that an effective wave- 
function optimization algorithm can be constructed solely on 
the basis of iteratively updating by the multi-determinant 
expansion of An example of this algorithm applied to 

a soluble model is presented in Subsection IV Bl However, 
multi-determinant expansions in DMC are computationally 
very expensive in large or continuum system, since the re- 
quired number of determinants to reach a given accuracy will 
in general grow combinatorially. The method developed in 
Section [TV] to optimize a single Slater determinant becomes 
very attractive. (Results of the application of this method were 
shown in Subsection IV D| ). For large systems, the number of 
multi-determinants must be kept to a minimum and the two 
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methods combined. Experimentation in small systems allows 
us to suggest an algorithm that will be efficient in larger sys- 
tems: 

1. An initial trial trial wave-function $t(R) is gener- 
ated using any fast method, e.g. an empirical screened 
pseudopotential^i or a Thomas-Fermi theory. 

2. The Jastrow factor J(R) is optimized within VMC. 

3. A DMC run is performed. The number of configura- 
tions N c sampled is increased as this step is repeated. 
Statistically uncorrected values of £„(R) and £„(R) 2 
are accumulated. 

4. The multi-determinant expansion of /(R) is con- 
structed. Only the terms that are significantly non-zero 
are included in the expansion. 

5. A distribution of configurations R^ with probability 
|W(R)| is generated. The gradients of Kdmc with 
respect to the effective nodal potential and the gradi- 
ents of the Jastrow factor coefficients are evaluated with 
Eqs. d37| ), and d42b . (Eventually the multi-determinant 
expansion coefficients oik can be included, see Subsec- 
tion UVFl ) 

6. The effective potentials V (r) and J(R) are updated 
(eventually also the oik)- New single particle orbitals are 
constructed using Eq. ( BTT i. Therefore the single particle 
orbitals used to construct the Slater determinants in the 
trial wave-function are now determined solely within 
DMC. 

7. A new ^j-(R) is constructed. Steps (5-7) are repeated 
until ^t(R) does not change. 

8. At this step we can choose to improve the scaling in 
large systems. The single-particle orbitals </> rl (r) shared 
by all determinants in the expansion $y(R) can be 
transformed to non-orthogonal localized orbitals 

9. The trial wave-function $t(R) is updated to $^(R). 
Steps (2-9) are repeated until \Pt(R) and Edmc do 
not changed 

Note that (i) the methods in Sections[ni]and[lV]are comple- 
mentary. In Section |nll we find a representation of the fixed- 
node ground state in a given basis. In Section |IV] instead, 
we optimize and change the basis of the wave-functions so as 
to reproduce the fixed-node ground-state wave-function with 
a minimum number of Slater determinants, (ii) Only single 
configurations are included in Eq. d37| i but multiple configu- 
rations are included in Eq. (fT4l . (iii) We include a Jastrow 
function in Eq. ( TBI to minimize the number of Slater deter- 
minants required in the expansion. However, a final run with 
no Jastrow factor included with the configuration interaction 
expansion might be useful in order to obtain a pure expres- 
sion of the ground-state density in terms of the single particle 
orbitals. Atomic forces could be obtained from this density. 
Finally (iv) the method is, in principle, self reliant: no DFT or 
HF are required. 



VII. SUMMARY 

We have presented an algorithm for sampling the fixed- 
node many-body wave-function in a single or multi- 
determinant expansion from a diffusion quantum Monte Carlo 
(DMC) calculation within the importance sampling technique. 
By combining this algorithm with a previously developed 
method for constructing effective potentials targeted at repro- 
ducing specific properties of the many-body wave-function, 18 
we presented an iterative algorithm that improves the nodes 
of the trial/fixed-node wave-functions used in DMC. Tests on 
a simple two electron model system confirm that this method 
is able to improve the nodes and that, at least in the case of 
the tested system, we find wave-functions and energies that 
exactly match fully converged configuration interaction cal- 
culations. 

We have proven that the nodes of the fixed-node wave- 
function improve as compared with the trial wave-function if 
the kinks at the nodes are locally smoothed out. The algo- 
rithms presented take advantage of this proof. We have argued 
that if the kink at the node increases with the "distance" from 
the exact ground-state node to the trial wave-function node, 
the algorithm should be stable against random statistical fluc- 
tuations. Proving this property in general might be difficult 
and is beyond the scope of this article. Clearly, in the absence 
of a proof, experimentation in larger systems is required. 

While in the past, methods were used to obtain the fixed- 
node wave-function (e.g Ref. [2?1) , to our knowledge this is 
the first time the fixed-node wave-function has been obtained 
through importance sampling. The availability of the fixed- 
node wave-function provides routes to determine the exact 
Kohn-Sham potential, allowing benchmark tests of density 
functionals in highly non-trivial and inhomogeneous systems. 
It also seems likely that many of the wave function optimiza- 
tion approaches (e.g. Refs. laTHll) currently applied within 
variational Monte Carlo can be recast in the present scheme, 
making direct use of the fixed-node wave-function, and likely 
obtaining improved results. 

In ongoing work, we are continuing to develop these meth- 
ods. Applications to larger and more complex electronic sys- 
tems will be reported elsewhere. 
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